
17 research outputs found

Multilingual search for cultural heritage archives via combining multiple translation resources

Author: Debole Franca
Fantino Fabio
Jones Gareth J.F.
Newman Eamonn
Zhang Ying
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/06/2007

The linguistic features of material in Cultural Heritage (CH) archives may be in various languages requiring a facility for effective multilingual search. The specialised language often associated with CH content introduces problems for automatic translation to support search applications. The MultiMatch project is focused on enabling users to interact with CH content across different media types and languages. We present results from a MultiMatch study exploring various translation techniques for the CH domain. Our experiments examine translation techniques for the English language CLEF 2006 Cross-Language Speech Retrieval (CL-SR) task using Spanish, French and German queries. Results compare effectiveness of our query translation against a monolingual baseline and show improvement when combining a domain-specific translation lexicon with a standard machine translation system.

DCU Online Research Access Service

A Native XML Database Supporting Approximate Match Search

Author: Franca Debole

XML is becoming the standard representation format for metadata. Metadata for multimedia documents, as for instance MPEG-7, require approximate match search functionalities to be supported in addition to exact match search. As an example, consider image search performed by using MPEG-7 visual descriptors. It does not make sense to search for images that are exactly equal to a query image. Rather, images similar to a query image are more likely to be searched. We present the architecture of an XML search engine where special techniques are used to integrate approximate and exact match search functionalities.

CiteSeerX

An Analysis of the Relative Hardness of Reuters-21578 Subsets

Author: Fabrizio Sebastiani
Franca Debole

The existence, public availability, and widespread acceptance of a standard benchmark for a given information retrieval (IR) task are beneficial to research on this task, because they allow different researchers to experimentally compare their own systems by comparing the results they have obtained on this benchmark. The Reuters-21578 test collection, together with its earlier variants, has been such a standard benchmark for the text categorization (TC) task throughout the last 10 years. However, the benefits that this has brought about have somehow been limited by the fact that different researchers have “carved” different subsets out of this collection and tested their systems on one of these subsets only; systems that have been tested on different Reuters-21578 subsets are thus not readily comparable. In this article, we present a systematic, comparative experimental study of the three subsets of Reuters-21578 that have been most popular among TC researchers. The results we obtain allow us to determine the relative hardness of these subsets, thus establishing an indirect means for comparing TC systems that have, or will be, tested on these different subsets.

CiteSeerX

Multilingual Search for Cultural Heritage Archives via Combining Multiple Translation Resources

Author: Debole Franca
Fantino Fabio
Jones Gareth J.F.
Newman Eamonn
Zhang Ying
Publication date: 01/06/2007

CiteSeerX

Irish Universities

DCU Online Research Access Service

A Tutorial on the MILOS Multimedia Content Management System

Author: Claudio Gennaro
Fabrizio Falchi
Fausto Rabitti
Franca Debole
Paolo Bolettieri
Pasquale Savino

MILOS supports the storage and content based retrieval of any multimedia documents whose descriptions are provided by using arbitrary metadata models represented in XML. It provides developers of digital library applications with functionalities for dealing with heterogeneous digital documents, heterogeneous metadata, and metadata schema mapping. This paper shows how to configure and use all MILOS components.

CiteSeerX